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ABSTRACT 

The  Geometrically  Intrinsic  Nonlinear  Recursive  Filter,  or  Gl  Filter,  is  designed  to  estimate  an  arbitrary 
continuous-time  Markov  diffusion  process  X  subject  to  nonlinear  discrete-time  observations.  The  Gl 
Filter  is  fundamentally  different  from  the  much-used  Extended  Kalman  Filter  (EKF),  and  its  second- 
order  variants,  even  in  the  simplest  nonlinear  case,  in  that: 

•  It  uses  a  quadratic  function  of  a  vector  observation  to  update  the  state,  instead  of  the  linear  func¬ 
tion  used  by  the  EKF. 

•  It  is  based  on  deeper  geometric  principles,  which  make  the  Gl  Filter  coordinate-invariant.  This 
implies,  for  example,  that  if  a  linear  system  were  subjected  to  a  nonlinear  transformation  tot  the 
state-space  and  analyzed  using  the  Gl  Filter,  the  resulting  state  estimates  and  conditional  vari¬ 
ances  would  be  the  push-forward  under  f  of  the  Kalman  Filter  estimates  for  the  untransformed 
system  -  a  property  which  is  not  shared  by  the  EKF  or  its  second-order  variants. 

The  noise  covariance  of  X  and  the  observation  covariance  themselves  induce  geometries  on  state 
space  and  observation  space,  respectively,  and  associated  canonical  connections.  A  sequel  to  this 
paper  develops  stochastic  differential  geometry  results  -  based  on  "intrinsic  location  parameters",  a 
notion  derived  from  the  heat  flow  of  harmonic  mappings  -  from  which  we  derive  the  coordinate-free 
filter  update  formula.  The  present  article  presents  the  algorithm  with  reference  to  a  specific  example  - 
the  problem  of  tracking  and  intercepting  a  target,  using  sensors  based  on  a  moving  missile.  Compu¬ 
tational  experiments  show  that,  when  the  observation  function  is  highly  nonlinear,  there  exist  choices 
of  the  noise  parameters  at  which  the  Gl  Filter  significantly  outperforms  the  EKF. 
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Background  on  Nonlinear  Filtering 


1  Background  on  Nonlinear  Filtering 

1.1  Example:  A  Nonlinear  Filtering  Problem  in  Target  Tracking 

D'Souza,  McClure,  and  Cloutier  [8],  [9]  consider  the  following  tactical  air-to-air  missile  intercept  prob¬ 
lem.  The  state  of  the  target  is  represented  by  a  position,  velocity,  and  acceleration  in  space,  making 
nine  dimensions  in  all  (the  authors  also  model  three  time  constants  as  state  variables).  Data  consists 
of  a  sequence  of  noisy  observations  of:  range,  angle  from  vertical,  azimuth,  and  range-rate,  all  mea¬ 
sured  from  a  missile  with  known  position,  velocity,  and  acceleration.  The  goal  of  filtering  in  this  case  is 
to  provide  a  sequence  of  "good"  estimates  of  the  state  of  the  target,  based  on  all  measurements  so 
far,  so  as  to  defeat  the  target's  possible  evasive  maneuvers  and  intercept  it. 

D'Souza  et  al  [8]  point  out  that,  although  the  state  dynamics  can  be  modeled  linearly,  the  observa¬ 
tions  are  a  highly  nonlinear  function  of  the  state  (see  Section  2. 7. a).  Alternatively,  if  a  spherical  coor¬ 
dinate  frame,  based  on  the  missile,  is  used,  then  observations  are  linear,  but  the  state  dynamics  are 
highly  nonlinear.  Moreover  the  Extended  Kalman  Filter,  and  standard  second-order  filters,  will  give  a 
different  set  of  answers  in  the  Cartesian  coordinate  frame  than  in  the  spherical  one,  because  they  are 
"non-intrinsic",  i.e.  lacking  in  absolute  geometric  meaning. 

1 .2  Drawbacks  of  Current  Approaches 

1.2.  a  The  Infinite-Dimensional  Approach 

The  standard  mathematical  presentation  of  the  nonlinear  filtering  problem,  as  can  be  seen  for  exam¬ 
ple  in  Lipster  and  Shiryaev  [12],  and  Pardoux  [14],  involves  a  measure-valued  SDE  called  the  Zakai 
equation  (or  the  Fujisaki-Kallianpur-Kunita  formula).  This  is  virtually  never  used  in  real-time  filtering 
applications  because  it  is  impossible  to  store  enough  data  to  update  an  infinite-dimensional  SDE 
(although  see  Lototsky,  Mikulevicius,  and  Rozovskii  [13]  fora  computational  method  using  a  Wiener 
chaos  expansion). 

1.2. b  Finite-Dimensional  Filters 

Under  certain  circumstances,  the  conditional  law  can  be  described  using  a  finite  set  of  parameters. 
Although  this  topic  is  outside  the  scope  of  this  article,  an  account  of  recent  progress  using  geometric 
methods  can  be  found  in  Cohen  de  Lara  [3].  Apart  from  the  Kalman  filter,  these  methods  are  not 
widely  used  in  practice,  since  the  parameters  may  be  difficult  to  determine  in  theory,  large  in  number, 
and  difficult  to  update  computationally. 

1 .2. c  The  Extended  Kalman  Filter  and  Second-order  Filters. 

Linearizing  the  state  and  observation  about  the  most  recent  state  estimate,  and  then  applying  the 
Kalman  Filter,  gives  the  Extended  Kalman  Filter;  see  Jazwinski  [1 1]  and  Bar-Shalom  and  Fortmann 
[1],  The  goal  here  is  no  longer  to  describe  the  full  conditional  distribution  of  the  state  given  the  obser¬ 
vations,  but  merely  to  approximate  the  conditional  expectation  and  the  conditional  covariance.  As 
mentioned  above,  the  output  is  coordinate-dependent.  A  careful  asymptotic  analysis  of  this  and  other 
approximation  schemes  has  been  given  by  Picard  [15]  -  see  also  references  therein. 
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1 .3  Desirable  Properties  of  a  Nonlinear  Filtering  Algorithm 

1.3.  a  State  Evolves  Continuously,  Observations  are  Discrete 

The  state  dynamics  (for  example,  the  dynamics  of  an  aircraft)  should  be  modeled  by  a  stochastic  pro¬ 
cess  {X  t  >  0}  in  continuous  time,  on  a  differentiable  manifold  N.  However  since  digital  implemen¬ 
tation  of  a  filtering  algorithm  is  carried  out  using  discrete-time  observations,  the  filter  should  involve 
observations  Yr  Y2,  ...  collected  at  discrete  times  tl<t1<  ...  on  another  manifold  M. 

1 .3. b  State  Estimates  Should  Not  Be  Coordinate-Dependent 

Let  {Xt(1),t>0}  and  {Xt<-),t>0}  be  representations  of  {X  t>0}  in  two  coordinate  systems, 
where  Xt<_)  =  (])  (Xf* ^ ).  Likewise  let  Y^1*,  Y^1*,  ...  and  y|2\  ...  be  representations  of 

Yj,  Yv  ...  in  two  coordinate  systems.  We  require  that  our  state  estimate  of  X  ,  given 
{Y|“\  ....  Y^“* }  ,  be  the  image  under  (])  of  our  state  estimate  of  xj  ^  ,  given  {  Y^1*,  ...,  Y^  }  . 
Notice  carefully  that  this  criterion  excludes  use  the  conditional  expectation  E^Xt(1>  Yj  *  *,  ...,  Y^'J 
as  the  state  estimate,  because  it  does  not  have  this  kind  of  invariance.  The  replacement  of  conditional 
expectation  by  an  "intrinsic  location  parameter"  is  the  main  theoretical  contribution  of  this  work. 

1 .3. c  Must  Coincide  with  the  Kalman  Filter  in  the  Linear  Case 

When  {Xft>0}  is  a  continuous-time  Gaussian  process,  and  Yn  is  a  linear  function  of  X  with 

Ln 

additive  Gaussian  noise,  our  filtering  algorithm  must  give  the  Kalman  filter  state  estimates  (which  fully 
describe  the  conditional  distribution  of  the  state,  given  the  observations,  in  such  a  context.) 

1 .3. d  Optimality  up  to  Some  Order 

2 

When  the  noise  covariance  of  X,  and  the  observation  covariance  are  taken  to  be  O  (y  )  ,  where 

9 

8  =  y  is  the  time  interval  between  observations,  we  are  seeking  an  algorithm  which  is  optimal  up  to 

3 

O  (y  )  ,  in  a  sense  to  be  made  precise  later. 

1.3. e  Stability 

The  important  issue  of  stability  will  not  be  studied  here.  For  results  on  the  stability  of  the  EKF,  see 
Bossanne  et  al  [2]  and  Deza  et  al  [7]. 

2  The  Nonlinear  Model  and  its  Induced  Geometry 

The  geometric  ideas  in  this  section  may  be  unfamiliar  to  filtering  theorists,  so  we  shall  illustrate  them 
with  reference  to  the  specific  example  of  Section  1.1. 

2.1  The  State  Process 

Consider  a  (possibly  degenerate)  Markov  diffusion  process  {X  0  <  t  <  8}  on  N  =  Rp ,  written  in  local 
coordinates  as 

P 

dX't  =  b'(Xt)dt+  £a!(Xf)dW;,  /  -  1,  2,  ....  p , 

/  =  l 


(1) 
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where  Vb'^r—  is  a  vector  field  on  Rp ,  a  (x)  =  (o’  (x) )  e  L  (Rp;TxRp)  ,  and  W  is  a  Wiener  process 
C/X  •  j  j  3 

Rp  .We  assume  for  simplicity  that  the  coefficients  b  ,  a.  are  C  with  bounded  first  derivative. 


2.2  Geometry  Induced  by  the  State  Process 

2 

Such  a  o  induces  a  C“  semi-definite  metric  (.|.)  on  the  cotangent  bundle,  which  we  call  the  diffusion 
variance  semi-definite  metric,  by  the  formula 


(dx'\dxk)x  =  (o  •  o  (x) )  'k  =  Ya]  M  Gj  (x)  •  (2) 

/  =  l 

This  semi-definite  metric  is  actually  intrinsic:  changing  coordinates  for  the  diffusion  will  give  a  differ¬ 
ent  matrix  (o’.)  ,  but  the  same  semi-definite  metric.  The  pxp  matrix  ( (o  •  o)  ’;)  defined  above 
induces  a  linear  transformation  a  (x):T  *N— >T  N,  i.e.  from  the  cotangent  space  to  the  tangent 
space  atx,  namely 


a  (x)  ( dx )  =  £  (o  •  o)  ''d/dx. .  (3) 

Let  us  make  a  constant-rank  assumption,  i.e.  that  there  exists  a  rank  r  vector  bundle  E-4N,a  sub¬ 
bundle  of  the  tangent  bundle,  such  that  Ex  =  range  (o  (x) )  c  TxN  for  all  x  e  N  .  Darling  [5]  pre¬ 
sents  a  global  geometric  construction  of  a  canonical  sub-Riemannian  connection  V°  for  (,|.),  with 
respect  to  a  generalized  inverse  g,  i.e.  a  vector  bundle  isomorphism  g:TN  — >  T*N  such  that 

a(x)  •  g  (x)  «a(x)  =  a(x)  .  (4) 

In  local  coordinates,  g  (x)  is  expressed  by  a  Riemannian  metric  tensor  (g  )  ,  such  that  if 
a'  =  (a  ■  a)  \  then 


L  ir  s  j  ij 

a  grs  a  =  a  . 

r,  s 

The  local  connector  r  (x)  e  L  ( 7"xRp  ®  TxRp;TxRp)  for  V°  can  be  written  as: 

2g(T(x)  (u®v))  -w  =  D(g(v)|g(w))(u)  +  D{g  (w)  |g  (u))  (v)  -  D(g  (u)  \g  (v))  (w)  , 


(5) 


(6) 


where  g  (T  (x)  (u  ®  v) )  is  a  1  -form,  acting  on  the  tangent  vector  w.  This  formula  coincides  with  the 
formula  for  the  Levi-Civita  connection  in  the  case  where  (.|.)  is  non-degenerate;  for  more  details,  see 
Darling  [5].  Our  connection  V°  gives  rise  to  notions  of  geometry  such  as  geodesics,  parallelism, 
covariant  differentiation,  exponential  map,  and  curvature,  as  explained  in  texts  such  as  Darling  [4], 
Sakai  [16].  We  assert: 


Axiom  A:  The  appropriate  geometry  for  the  state  process  is  the  one  induced  by  the  diffusion  vari¬ 
ance  semi-definite  metric. 


2.3  Intrinsic  Description  of  the  Process 

The  intrinsic  version  of  (1 )  is  to  describe  X  as  a  diffusion  process  on  the  manifold  N  with  generator 
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L  =  ^+-A  (7) 

where  A  is  the  (possibly  degenerate)  Laplace-Beltrami  operator  associated  with  the  diffusion  variance, 
and  £,  is  a  vector  field,  whose  expressions  in  the  local  coordinate  system  {x\  ...,xp}  are  as  follows: 

a  =  1(0-0) {D..  -  £r k.Dk}  ,  %  =  £  ibk  +  ^  (o  •  a)  'VrJ}  Dfc .  (8) 

i,  j  k  k  i,  j 

k  'I 

Note  that  ^r;.  (a  •  a)  has  been  specified  by  (6). 

2.4  Target  Tracking  Example 
2. 4. a  State  Process 

3  3  3 

The  state  x  consists  of  a  column  vector  whose  components  (p,  v,  a)  e  R  xR  xR  are  respectively 
the  location,  velocity,  and  acceleration  of  the  target  in  three-dimensional  space.  We  model  the  accel¬ 
eration  as  an  Ornstein-Uhlenbeck  process,  with  the  constraint  that  acceleration  must  be  perpendicu- 

2 

lar  to  velocity,  i.e.  v-a  =  0 ,  or  equivalently  that  ||v||  is  a  constant.  The  (v,  a)  components  can  be 

2  6 

considered  as  taking  values  in  the  four-dimensional  manifold  TS“  <z  R  . 

Thus  within  a  Cartesian  frame,  the  equations  of  motion  take  the  nonlinear  form: 

dp  0  I  Op  0 

dv  =  0  0  I  v  dt+  0  /  W 

_cfaj  [o  -P  M 1  ~hp  (y)J  L0J  Lyp  (y)  dW  (f)_ 

where  the  square  matrix  consists  of  nine  3x3  matrices,  A,  is  a  positive  time  constant,  y  =  ^Acj  deter¬ 
mines  the  noise  intensity, 

p(x)=||a||2/M2,  (10) 


P(v)e/-^£L(R3;R3), 


and  W  is  a  three-dimensional  Wiener  process.  Note  that  P  (v)  is  precisely  the  projection  onto  the 

3 

orthogonal  complement  of  v  in  R  ,  and  p  (x)  has  been  chosen  so  that  cf  (v  •  a)  =  0  .  (D'Souza  et  al 

[9]  describe  a  procedure  for  estimating  A,  but  in  our  simulations  we  assign  to  it  a  predetermined 

2 

value.)  The  constancy  of  ||v||  implies  that 


DP  (v)L 


— ^{CvvT  +  vCvT} 

iivir 


2.4.b  Geometry  Induced  by  the  State  Process 

2 

The  diffusion  variance  metric  (2)  is  degenerate  here;  noting  that  P“  =  P,  we  find 
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0  0  0 

a  =  a • a=  0  0  0  ,  (13) 

0  0  y2P  (v) 

-1  9 

where  0  denotes  03x3  .The  rescaled  Euclidean  metric  g  =  y  lg  on  R  is  a  generalized  inverse  to  a 

2 

in  the  sense  of  (4),  because  P  =  P  .  In  Section  5  of  [5]  we  show  in  more  defail  fhat  the  correspond¬ 
ing  local  connector  r  (x)  as  in  (6)  is  given  by 


r(x)(C®q)  =  S(C®f  v,  S(C®q)^  C .  (14) 

2M  T_  T 

9 

where  a  tangent  vector  ^  to  R  is  broken  down  into  three  3-dimensional  components  i^p,  C,y,  C,a  .  Note 


2 

0 

0 

r(x)  (a  •  a  (x) )  =  \ 

P(v) 

V  = 

0 

llvll 

0 

0 

2.4.c  The  Intrinsic  Vector  Field 

It  follows  from  (8),  (9),  and  (15)  thaf  the  formula  for  the  intrinsic  vector  field  S,  is: 


9(x)  = 


p  (x)  v  -  A.P  (v)  a 


Differentiate  under  the  assumptions  ||v||“  is  constant  and  v  •  a  =  0,  to  obtain 

Tn  /  n  1  t 


0  / 

D^(x)  (0  =  0  0 


/  >Q  = 


Lo  ?iQ-p  -IP-2Q]  £ 


In  Section  5  of  [5],  we  show  that,  if  we  write  a  symmetric  tensor  %  e  TxN  ®  T^N  in  3  x  3  blocks  as 
the  matrix 


^pp  ^pv  ^pa 

Xvp  Zvv  Zva  ' 
Zap  Zay  Zaa 


where  y  =  Z  ,  then 

A-qv  A/ vq  • 
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0 

D\(x)  (X)  =  —2  0  (18) 

Hvll  Tr(y  )  v  +  (x  +X  )  a 

VA/oa/  VA,vo  A'av/ 

2.4.d  Curvature  of  the  State  Space 

The  curvature  tensor  is  given  by  the  formula  (omitting  x): 

R(q,ti)C  =  or(ri)  (C®s) -DT(q)  (£®ti)  +  r(r(C®q)  ®T1)  -r(r(C®r|)  ®g  ,  <i9) 

2 

and  this  may  easily  be  computed  from  (14),  noting  that,  for  example,  since  ||v||  is  constant, 

Dren)  (C ®s)  =  s (^  ®  q) r)v/ ( 2. || v|| 2) . 


2.5  Covariance  Tensor  of  a  Random  Variable  in  a  Riemannian  Manifold 

We  now  introduce  a  local  covariance  concept,  which  we  use  for  describing  the  uncertainty  in  the  state 
estimates.  Suppose  N  is  any  manifold  with  a  torsion-free  connection,  p  e  N,  and  exp  is  the  expo- 
nential  map  from  T^N  to  N.  Suppose  S  is  a  random  variable  with  values  in  N  (we  assume  that  S  takes 
values  in  the  image  of  the  set  on  which  exp^  is  injective),  and  E  is  a  symmetric  element  of 

2. 5. a  Definition 

S  will  be  said  to  centered  at  p  with  co  variance  tensor  E  if  r|  =  exp' S  satisfies  E  [r|]  =  0  e  T  N ,  and 
for  any  cotangent  vectors  0  and  X  at  p. 


E  [  (0  ■  rj)  an)]  =  (0®X)  •  E  . 

In  more  concrete  terms,  if  {e,,...,e  }  is  some  basis  of  T  N ,  and 

1  P 


(20) 


E  =  £E'7e.®e/( 
id 

then  (E<7)  is  the  covariance  matrix  of  the  random  vector  (p\  •••,  nP)  defined  by  exp^^S 

2.6  The  Observation  Covariance  Metric 


In  our  model,  the  observation  Yn  will  be  the  image  under  the  exponential  map  of  a  zero-mean  ran¬ 
dom  variable  V  in  the  tangent  space  at  \\t  (X  )  .  Thus  when  \\t  (Xf  )  =  y ,  Yn  is  centered  at  y  with 

n  n 

covariance  tensor  (3  (y)  ,  a  non-degenerate  symmetric  tensor  in  TyM  ®  TyM  .  Provided  y  —>  (3  (y)  is 
sufficiently  regular,  it  serves  as  the  metric  tensor  for  a  metric  (.|.)o  on  the  cotangent  bundle  of  M, 
called  the  observation  covariance  metric,  namely 


(dy'\dy')0  =  (3''. 


We  assert: 
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Axiom  B:  The  appropriate  metric  for  the  observation  space  is  the  observation  covariance  metric, 
not  the  Euclidean  metric. 

We  denote  by  (g°)  the  metric  tensor  on  TM ,  inverse  to  (p^)  ,  and  by  g°  the  associated  Rieman- 
nian  metric.  The  Levi-Civita  connection  V  on  M  has  local  connector  T(.)  ,  computed  as  follows: 


-m  -1  J  o  dp''"1  ,  o  dp'<m  ,  d3ij  mk 


2.7  Target  Tracking  Example,  Continued 

2. 7.  a  Observation  Function 

The  observables  are  respectively:  range,  angle  from  vertical,  azimuth,  and  range-rate  (all  measured 
from  a  missile  wifh  known  sfafe  ( pM ,  vM,  aM)  )  and  a  fictitious  measurement;  the  latter  is  a  zero- 
mean  Gaussian  random  variable  representing  a  fictitious  observation  of  the  inner  product  of  velocity 
and  acceleration  of  the  target,  which  according  to  our  model  should  be  zero.  Take 
\\r.R3  x  R3  x  R3  -4  R  +xS2  x  R2  to  be: 

1  2  3  4  5 

V(P>  v.  a)  =  (<J>(p-pM),  |v-vM|,a  •  v)  =  ( (y  ,y  ,y  ),y  ,y  )  (22; 

where  <1 >  =  h  1 ,  and  h  is  the  spherical  coordinate  transformation 

h  (r,  0,  ( ]))  =  (rsin0cos(|),  rsin 0 sin (f),  rcos0)  .  (23; 

For  the  sake  of  brevity,  we  omit  here  the  calculations  of  the  first  and  second  derivatives  of  \| r. 

2.7.  b  Observation  Covariance  Metric 

The  covariance  matrix  for  the  five  observed  quantities  is  taken  to  be  of  the  form 


2  si  s3  22 

P  (y)  —  diag  r"s0, -+s2, -+s4,  r  s5.  Op  =  diag  (hp  h2,  hy  h4,  h5)  , 


where  r  =  y  denotes  range,  and  other  quantities  are  constants.  Then 

fy  =  51(.diag(h1/,h2kh3/)h4',0)  , 
and  with  g°  (y)  =  diag  (h/,  h3 ,  h3\ h 4\  h^)  , 


dg°  s  „  „  ,  (hl  h2  h3  h4 

y  =-5lfcK,K.d,ag  ,0  . 

yk  <hl  h2  h3  h4  ) 


Now  (21 )  gives 


r'V  “  llVik  2hl_5i/5/fch.  _5i fikh. 

I  n;  i  i 
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2.8  Summary:  the  Model  in  Intrinsic  Terms 

Following  the  discussion  above,  we  can  rephrase  the  nonlinear  filtering  model  in  an  abstract  way. 

2. 8.  a  Model 

The  model  consists  of: 

•  A  manifold  N,  called  the  state  space,  a  canonical  sub-Riemannian  connection  V°  induced 
by  a  diffusion  variance  semi-definife  metric  (,|.)  on  T*N ,  and  a  vector  field  £,  on  N;  these 
serve  to  define  the  generator  L  =  £  +  ^ A  of  a  diffusion  process  X  on  N; 

•  A  Riemannian  manifold  (M,  g°)  ,  called  the  observation  space,  and  the  Levi-Civita  connec¬ 
tion  V  induced  by  g°  . 

3 

•  AC  function  \\r:N  — >  M ,  called  the  observation  function. 

2.8. b  Data 

Data  consist  of: 

•  A  point  )10  e  N ,  called  the  initial  state  estimate; 

•  En  e  T  N  ®  T  N ,  the  covariance  tensor  of  the  initial  state  estimate; 

M-o 

•  A  sequence  of  times  0  <  f  l  <  t2  <  . . . ,  and  for  each  n  >  1  a  noisy  observation  Yn  of  \|/  (X  ) 

n 

(in  the  sense  of  paragraph  2.6). 

2.8. c  Goal 

The  goal  is  to  construct  a  sequence  of  state  and  covariance  estimates  ( p.n,  tn)  for  the  state  process, 
n  =  1,  2,  ... ,  with  the  following  two  properties: 

•  For  a  linear  system  subject  to  invertible  smooth  non-linear  transformations,  our  estimates 
should  be  the  transforms  of  the  Kalman  filter  estimates. 

•  The  construction  of  (pn,  En)  is  intrinsic  -  i.e.  unaffected  by  choice  of  coordinates  -  and  opti- 

4 

mal  up  to  O  (y  )  ,  where  y  is  a  noise  intensity  parameter. 
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3  Intrinsic  Geometric  Quantities  Associated  with  the  Model 

The  following  calculations  will  apply  to  every  time  interval  [t  r  t  ]  ,  but  for  notational  simplicity  we 

shall  treat  only  use  a  time  interval  [0,  8]  .  Let  us  fix  a  starting-point  xn  e  N ,  and  £n  e  T  N  ®  T  N  . 

u  u  M-o  M-o 

3.1  Basic  Notations 

3.1  .a  The  Deterministic  Flow  and  Its  Derivatives 

{ <|>t,  t  >  0}  is  the  flow  of  the  vector  field  q,  and  xf  =  <|)  (xQ)  .  We  assume  non-explosion,  so  <| )f:N  — >  N 

2 

is  a  C  diffeomorphism  for  each  t.  The  flow  {§t,  t  >  0}  induces  a  two-parameter  semigroup: 

x*  =  c/  (<|)f  •  (frj1)  (xs)  e  L(TxN-JxN)  .  (26) 

In  local  coordinates,  we  compute  x*  asapxp  matrix,  given  by 

x*  =  exp{^D^(xu)du}  .  (27) 

3.1  .b  Push-Forward  of  the  Diffusion  Variance  Semi-definite  Metric 

For  any  vector  field  C,  on  N,  and  any  differentiable  map  (| ):N  — >  P  into  a  manifold  P,  the  "push-for¬ 
ward"  (])*£  takes  the  value  cf( |)  •  C,  (x)  e  T^P  at  y  =  (|)  (x)  e  P;  likewise  (|)*  ®  £')  =c|)*C  ®  (j)*^  • 

The  diffusion  variance  semi-definite  metric  (,|.)  can  be  considered  as  an  element  of  T  N  ®  T  N  . 

xt  xt  xt 

Hence  the  following  two  quantities  are  intrinsic: 

nt^o  +  (0^JM\ds  =  E0  +  J\s°(a.a)  (xs)  (S/ ds  e  TxN  ®  TxN ;  (28) 


Intrinsic  Geometric  Quantities  Associated  with  the  Model 


3.2  Approximate  Intrinsic  Location  Parameter 

Assume  that  X  is  a  diffusion  on  N  with  generator  L  =  t,  +  ^A,  and  random  initial  value  X„,  centered 

3  2 

at  Xq  with  covariance  tensor  EQ .  A  C  function  \\r:N  — >  M  is  given.  We  recall  from  [5]  that  there 
exists  a  vector  in  the  tangent  space  (x  }M  which  supplies  a  coordinate-independent  replacement 

for  the  notion  of  expected  value  of  \|/  (Xs)  .  This  vector,  denoted  /  v  [\(/  (Xs)  ]  ,  is  called  the 

o  x0,  z,0  0 


approximate  intrinsic  location  parameter  (AILP)  of  \|/  (Xs)  in  the  tangent  space  T  ,  .  M  .  We  here 
omit  any  discussion  of  how  the  AILP  is  derived  from  the  study  of  manifold-valued  martingales,  or  its 
relation  to  harmonic  mappings,  but  merely  state  the  formula 


=  ^{Vc/yfxg)  (»8)  +  \|/*  v c/<f)g  (xQ)  (n8)  -|%f8vd(^t(x0)  (dry]} ,  02) 

using  the  notations  of  Section  3.1.  In  the  particular  case  where  \|/  is  the  identity,  we  obtain 
I  y  [Xs]  ,  the  AILP  of  Xs  in  the  tangent  space  T  N,  given  by 

X0,  o  o  X0 

-  ^{Vd<|>8(x0)  (n8)  -jytsvdcyx0)  (dnt)} .  03) 

3.3  Numerical  Evaluation  of  the  Geometric  Quantities  Above 

Suppose  we  have  discretized  the  interval  [0,  8]  .  We  now  explain  how  to  evaluate,  at  consecutive  time 
steps  u  <t ,  the  quantities  {xt,  x*,  mf}  ,  where  ms  =  lx  £  [X§]  .  The  ODE  dxf/dt  =  ^(xf)  can 
be  solved,  for  example,  using  the  scheme 

2  3 

xf  =  xu+  (t-u)5  +  -^^-D5(xu)  (£)  (^®^)+D^(xu)  (D^(xu)^)},  (34) 

where  c,  is  short  for  c,  (x  )  .  Using  consecutive  pairs  (x  xf)  computed  from  (34),  we  can  discretize 
and  solve  (27),  using  the  trapezium  rule: 

\  «  exp  {—y—  [D$  (xu)  +  D5,  (xt)  ]  }  ,  (35) 

where  Tq  =  / ,  and  Tg  =  x*Tg  .  Take  =  Eg,  and  use  (34),  (35),  and  the  trapezium  rule  to  solve: 

E.  =  -  (a  •  a)  (x.)  +Tt  [s  +  ^ -  (a  •  a)  (x  )]  (x* )  .  (36) 

According  to  [5],  the  local  formula  for  mR  =  l  y  [X*]  is 

°  X0’  ° 


m8  =  \  ^K8“T0r(x0^  (^o)  +r(xs)  (S8^  ' 


k5  =  J*t.  ht[D\{xt)  (St)  -  T  (xt)  (aa(xt))]dt. 


The  last  integral  may  be  evaluated  by: 
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K t*^[D25(xt)  (St)  - r (Xt)  (a  •  0 (xf) ) ]  +Ttu{Ku  +  t-^[D2^(xu)  (Eu)  -T(xu)  (a-a(xu))]} 
Finally  (30),  (32),  and  (33)  show  that 

=  ^{D2¥(x8)  (Eg)  -Jr(xg)  (S8)  +f  (ys)  (JSg/) }  +Jms,  (38) 

where  J  =  D\\t  (xg)  .  We  compute  Vc/< |)g  (xQ)  in  a  similar  way,  using  (31 ). 

4  The  Fundamental  Theorems 

Throughout  this  section,  X  is  a  diffusion  on  N  with  generator  /_  =  £+  ^ A,  and  random  initial  value 
XQ,  centered  at  xQ  with  covariance  tensor  EQ  .  We  are  given  a  C  function  \| r.N  — >  M ,  where  M  is  a 
Riemannian  manifold  of  dimension  q.  Let  (3  (y)  e  TyM  ®  TyM  be  the  inverse  metric  tensor  at  y  e  M , 
which  can  be  interpreted  as  an  observation  covariance  metric  as  in  Section  2.6.  Consider  a  single 
observation  Yj  of  the  form: 

Yl^exPV(X6)Vle  M' 

where  Vj  is  a  mean-zero  random  vector  in  (M  ,  with  covariance  tensor  (3  (y)  when 

V  (X§)  =  y,  but  which  is  otherwise  independent  of  UQ  and  of  the  Wiener  process  W. 

4.1  Orders  of  Magnitude  of  Noise  Terms 

We  shall  suppose  that,  for  some  small  parameter  y,  the  matrices  for  a  =  a  •  a  (see  (3))  and  (3  (see 
Section  2.6)  satisfy 

a(xt)  =  y2a0(xf)  ,  0<f  <8;  (3(\(/(x8))  =  y2(3Q (y (x8) )  ;  (39) 

where  aQ  is  some  other  semi-definite  metric,  and  (3Q  another  metric.  Also  assume  that,  with  respect 
to  the  metric  g  appearing  in  (4),  the  distribution  of  L/Q  =  expx1  (XQ)  satisfies: 

||E[U0]||  -  O  (y4)  ,  p0||  —  ||Var((J0)  ||  -  0(y2),  ||E  [T(U0,  UQ>  UQ)  ]  ||  -  0(y4),  (40) 

for  arbitrary  tensor  fields  T  of  type  (1,3),  whose  norm  is  1 . 

4.2  Definition 

Let  U  and  Z  be  integrable  random  variables  in  Rp ,  and  Y  a  random  variable  in  Rq  .  We  shall  say  that 

4 

Z  approximates  E  [U \  Y]  uptoO(y)  if 


E  [b  (Y  -  E  [Y] )  ®  (Z-U)]  =  O  (y  )  , 

for  every  h  e  C1  (Rq-,RP)  with  max  {sup  ||h  (y)  ||,  sup  ||Dh  (y)  ||}  =  1. 


(41) 
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4.3  Interpretation  of  Tensors 

To  understand  formulas  such  as  (45)  below,  note  that  (3  (y)  e  TyM  ©  TyM  can  be  interpreted  as  an 

element  of /.  (T  *M;T  M)  ,  as  in  (3).  For  yg  =  \)/ (xg)  ,  the  adjoint  of  J  =  D\(/ (xg)  e /.  (T  N;T  M)  is 
T  Y  Y  y  x8  Yg 

written  J  e  L  (T  *M;T  *N)  ,  and  consequently  ZJ  e  L  (T  *M;T  N)  .  For  the  convenience  of 

Yg  xs  0  Yg  xa 

users,  (45)  is  expressed  as  a  matrix  product,  but  it  actually  represents  a  vector  bundle  morphism. 

We  quote  the  main  result  of  [6]. 

4.4  Theorem  (Intrinsic  Conditional  Expectation  Formula) 

Consider  the  random  vector  Ug  ©  Zg  e  Tx  N  ©  (x  >  M ,  where  y§  =  V  (xg) ,  given  by 

US  =  exp^1  (X8)  ,  Z8  =  exp^1  (Y, )  .  (42) 

(i)  Under  the  assumptions  (39)  and  (40),  the  joint  distribution  of  Ug  and  Zg  satisfies 


'x  E  tX8] 

Xq,  O 

/x  e  rv(Xg)] 
_  x0’  ^0  ° 


+  O  (y4)  ; 


(43) 


in  terms  of  the  approximate  intrinsic  location  parameters  of  Section  3.2,  where  Sg  is  given  by  (29). 

4 

(ii)  E  [UglZg]  is  approximated  up  to  O  (y  )  (in  the  sense  of  (41 ))  by 


I  L  [Xg]  +  GZS  +  P  (Zs  ®  Zs)  -  E  [p  (Z8  ®  Zs)  ]  ,  (44) 

x0’  ° 

where  Z§  =  Z,-I  y  [f  (X4]  ,  and  Ge  L(T  M;T  N )  is  analogous  to  the  Kalman  gain,  namely 

°  x0’  ^0  °  Yg  x  5 

G  =  SgJ7[JEg/+p(ys)]  \  (45) 

where  J  =  Dw  (xs)  ,  ancf  p  e  L(T  M®T  M;T  N)  satisfies 

0  Ys  Yg  xg 

p(z®z)  =  [/  -  GJ]  Vd(|)g  (xQ)  (TgGz®XgGz)  -GVd\|/(x§)  (Gz®Gz)}  ,  (46) 


E[p(Z§®Zs)]  =  p(GJSg)  +0(y4)  . 

4 

(Hi)  Var(L/g|Zg)  is  approximated  up  to  O  (y  )  by  (l-GJ)Z^. 

(iv)  If  U§  denotes  the  difference  between  U5  and  (44),  and  if  T  is  a  tensor  field  of  type  (1,3)  on  N  of 
norm  1 ,  then 


E  [T (l/8,  Us,  Us)]  =  0(y4)  . 


(47) 
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4.5  Computation  of  Exponential  Barycenters 

Let  us  recall  from  Emery  and  Mokobodzki  [1 0]  that  an  exponential  barycenter  for  a  random  variable 
S  on  a  manifold  N  with  a  torsion-free  connection  T  is  a  point  z  e  N  such  that  the  random  variable 
expj  (S)  e  TzN  has  mean  zero.  Suppose  that  are  given  a  point  xe  N ,  and  moments 

E  [exp  1  (S)  ]  e  T  N ,  E  =  Varfexp  1  (S))  e  T  N  ®  T  N  .  (48) 


We  would  like  to  compute  from  these  moments  an  exponential  barycenter  for  S,  or  at  least  a  "good" 
approximation.  We  quote  a  result  of  Darling  [6].  The  norm  ||  .  ||  is  with  respect  to  some  reference 
metric  for  N,  which  need  not  be  related  to  the  connection.  Given  the  curvature  tensor  (1 9),  the  vector 
field  1  s  denoted 

4.6  Theorem  (Exponential  Barycenter  Formula) 


Suppose  that  the  moments  (48)  satisfy: 


O  (y) ,  ||E||  =  O  (y  ) ,  for  a  small  number  y.  Define 


z  =  expx  { ft  -  ^  £  RijkV-'I'k}  ■  <49) 

i.j.k 

-1  4 

Then  E  [expz  (S)  ]  =  O  (y  );  in  other  words,  z  is  an  approximate  exponential  barycenter  for  S.  If  T 

4 

is  a  tensor  field  of  type  (1,3),  and  if  ||  E  [T  (t)  -  p,  r|  -  p,  r|  -  p)  ]  ||  =  O  ( y  ) ,  where 
r|  =  exp^^1  (S)  e  then 

||E[T(expz1(S),expz1(S),expz1(S))]||  =  0(y4).  (50) 

4.7  Remark  on  the  Validity  of  Recursion 

Note  carefully  the  relationship  between  the  results  (47)  and  (50),  and  the  assumption  (40).  Assume 
that  (40)  holds.  The  conditional  law  of  X§(  given  Yj ,  will  be  represented  by  a  random  variable  X5  on 
N,  whose  exponential  barycenter  z  is  computed  according  to  (49),  where  p  is  computed  from  (44), 
and  E  is  (/-  GJ)  Eg  .  By  (47)  and  (50),  the  random  variable  expz1  (X§)  in  T_,N  will  satisfy  the  same 
conditions  that  U0  satisfied  in  (40).  Therefore  the  algorithm  can  be  repeated  on  the  next  time  inter¬ 
val  [8,  28]  ,  at  the  end  of  which  we  receive  another  observation  Y-, ,  etc. 

5  Gl  Filter  Algorithm 

As  before,  the  state  process  X  is  a  diffusion  on  N  with  generator  L  =  ^  ,  and  random  initial  value 

X0  centered  at  p.Q  with  covariance  tensor  E0  .  A  C  function  \| r.N  — >  M  is  given. 

5.1  Discrete-time  Observations 

At  each  of  the  discrete  times  0  <  tj  <  t,  <  . . . ,  we  make  an  observation  Yn  of  the  form: 


Yn  =  eXPV(Xf(n),VneM' 
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where  V  is  a  mean-zero  random  vector  in  T  IY  ,  with  covariance  tensor  (3  (y)  when 

■  v  t (n) ' 

\|/  (Xf  (n) )  =  y ,  but  which  is  otherwise  independent  of  X.  Define  a  sequence  of  sigma-fields 


3  sa{Y,,  ....  Y  }  . 

n  L  1  nJ 


At  time  t  ,  we  wish  to  compute  an  3n  -measurable  random  variable  with  values  in  N,  and  an  3n  - 

-  Y 

measurable  random  variable  En  e  T,  N  ®Tr  N ,  such  that,  conditional  on  3  ,  X  is  approximately 

4  "r>  Fn  _  n  ln 

(i.e.  up  to  O  (y  )  )  centered  at  pn  with  covariance  tensor  En  . 

For  any  n  >  1 ,  suppose  that  Yp  Yn _  1  have  been  observed,  from  which  we  have  calculated  at 
time  t  _  j  a  state  estimate  pn  j  and  its  associated  covariance  tensor  En  _  1 .  The  Gl  Filter  update 
formula  computes  p.  ,  and  En  as  described  in  Sections  5.2  -  5.5. 


5.2  Precomputation 

First  we  carry  out  all  the  computations  described  in  Section  3.3,  starting  from  xQ  =  pn  _  j  and 
Eq  =  En  _  1  .The  time  interval  [t  p  t  ]  is  here  represented  as  the  time  interval  [0,  8]  ,  so  when  we 
refer  to  xg,  Eg ,  etc.,  we  are  really  referring  to  quantities  at  time  t  .  The  size  of  the  computation 
depends  on  the  number  of  sub-intervals  into  which  we  divide  [0,  8]  ,  which  can  be  as  low  as  1 .  Thus 
we  obtain  numerical  expressions  for  all  of  the  following: 

x8’  ^8’  Vc,v(/  ’  ~S’  V^8  (xo)  •  /x(),  £0  ’  ^x(),  £0  [V  ^ 

0  8 

as  well  as  Tg=  (tq)  .  From  these  we  compute  the  important  coefficients  G  and  p,  defined  in  (45) 
and  (46),  respectively. 


5.3  Data  Assimilation 

All  the  formulas  in  this  section  are  based  on  Theorem  4.4.  We  pull  our  new  observation  Yn  back  into 
the  tangent  space  T  M  ,  by  defining 

y  § 

ZS=  eXPyl  (yn)  /  (51) 


^§  =  ZS-/X  V  [V|/  (Xs)  ]  .  (52) 

o  X0,  Z.Q  o 

See  (58)  for  a  simple  formula  for  Zg .  In  effect,  Zg  is  the  "innovation",  since  it  is  the  difference 

4 

between  the  pulled-back  observation  Zg  and  its  expected  value,  up  to  O  (y  )  .  Next  compute  the 
approximate  conditional  expectation  of  Ug=expx1  (Xg)  ,  given  Yn,  namely 

M  =  1  £  [Xg]  +  GZs  +  p  (Zg  ®  Zs)  -  p  (GJSg)  e  T  N  .  (53) 

Note  that  p  is  non-zero  when  any  kind  of  non-linearity  is  present,  so  p  is  a  quadratic  function  of  the 
innovation,  not  a  linear  one  (as  occurs  in  the  Extended  Kalman  Filter,  for  example).  The  approximate 
conditional  variance  of  Ug  =  expx1  (Xg)  ,  given  Yn,  is 
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£=  (l-GJ)Z,e  T  N  ®T  N. 

0  xs  Xc 


5.4  Update  of  the  State  Estimate 

Recall  that  the  canonical  sub-Riemannian  connection  V°  on  N  induces  a  curvature  tensor,  as  in  ( 1 9), 
and  the  vector  field  Rl  is  denoted  . 

Define  the  new  state  estimate  p.n  by  the  formula  of  Theorem  4.6  for  the  conditional  exponential 
barycenter  of  Xg/  given  Y  namely 

An  =  expx6  R;jk^k}  6  N  ■  (55) 

i.j.k 

See  (59)  for  a  simple  formula  for  (55).  Finally  tn  e  T,  N  ®  T.  N  is  the  push-forward  of 

E  e  T  N  ®  T  N  along  the  geodesic  flow;  see  (60)  for  a  straightforward  method  to  compute  En  . 

8  8 

5.5  Computation  of  Exponential  Maps  and  Inverse  Exponential  Maps 

Computation  of  expx  (.)  in  (55),  exp  1  (.)  in  (51 ),  and  En  e  T.  N  ®  T.  N ,  involves  solving  the 
first-order  ordinary  differential  equations  for  the  geodesic  flows  on  the  tangent  bundles  TN  and  TM 
respectively,  which  we  describe  briefly  in  Section  5.6.  However  there  are  also  "single  step"  versions, 
which  most  practitioners  will  prefer  to  use,  and  which  depend  on  the  following  classical  formulas  of 
local  differential  geometry,  proved  for  example  in  Darling  [6]: 

5. 5.  a  Expansion  of  the  Exponential  Map 

Forve  TxN  =  Rp, 

exp  v  =  x  +  v-^r(x)  (v®v)  +7[2r(x)  (r(x)  (v®v)  ®v)  -  DT  (x)  (v)  (v®v)]  +0(||v||4)  .(56) 
x  2  6 

5.5.  b  Expansion  of  the  Inverse  Exponential  Map 

The  expansion  for  exp^1  (z)  ,  taking  w=z-y,  is: 

w  +  (y)  (w  ®  w)  +7  [Dr  (y)  (w)  (w  ®  w)  +  F  (y)  (T  (y)  (w  ®  w)  ®  w)  ]  +0  (||w||4)  .  (57) 

2  o 

5.5. c  Application 

A  simple  way  to  approximate  (51 ),  avoiding  use  of  the  derivative  of  the  connector,  is: 


z8-Yn-ys  +  5f(x8)((Yn-x8)®  (yn-y8))- 


Since  we  have  to  differentiate  T  ( .)  in  any  case  to  evaluate  ( 1 9),  take 


3.4-  ‘Ik 
k 


(59) 
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5.6  Geodesic  Flow 

This  section  is  for  those  who  seek  more  accurate  calculations  than  the  ones  described  in  Section  5.5. 
The  geodesic  equation  on  N  can  be  represented  as  a  first  order  ODE  on  the  tangent  bundle  TN ;  in 
local  coordinates,  a  solution  is  given  by  the  geodesic  flow 

Y(s)  1  =  K  (  ["7(0)11  =  ^(Y(0),C(0)) 

_C(s)J  sl  Lc  (0)J  J  /(T  (0)^(0)) 

in  RP  ©  RP  ,  satisfying  the  system  of  ODE 


=  My,  Q  =  ^ 

L-r  (y)  (C®0 


(62) 


(See  Sakai  [16],  p.  56.)  For  example,  to  compute  y(1)  =  exp  (v)  ,  the  initial  conditions  will  be 

xs 

Y(0)  =Xg,^(0)  =  v  .  Here  (y(s),0<s<1}  will  be  a  geodesic  on  N.  In  order  to  push  a  tensor 
forward  from  T  (Q)  N  to  T^  ( ^  N ,  i.e.  to  compute  (7^  )  *:T^  ~*T  ,  we  must  compute  the 

derivative  flow  {F(s),0<s<l}  in  L  (Rp  ®  Rp  :RP  ®  Rp)  satisfying 

F'  =  Dh(y,QF,  F(0)  =  /.  (63) 

Fn  F,,  i~i 

We  partition  F  =  '  intopxp  matrices;  then  (jtj )  *  =  Fn(l). 

Ml  M2 

6  Distinctive  Features  of  the  Gl  Filter 


6.1  Invariance  Under  Change  of  Coordinate  Systems 

All  the  formulas  for  the  Gl  Filter  come  from  Theorems  4.4  and  4.6.  All  the  mathematical  quantities 
occurring  in  these  two  theorems  are  tensorial,  i.e.  are  definable  without  using  coordinates,  and  hence 
have  the  same  intrinsic  meaning  for  all  coordinate  systems.  The  only  differences  between  computa¬ 
tions  in  different  coordinate  systems  will  arise  from  numerical  errors  resulting  from  discretization, 
which  can  be  made  as  small  as  desired. 
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6.2  Consistency  with  the  Kalman  Filter  Under  Nonlinear  Transformations 

Suppose  that  {X  t  >  0}  is  an  Ornstein-Uhlenbeck  process  satisfying  the  SDE 

dXt  =  An  _  jXt dt  +  an_ldWt,tn_l<t<tn;  (64) 

Y 

here  A  j  and  a  j  are  3n  j -measurable  p  x  p  matrices.  Also  suppose  that 

Y  =  J  ,Xt  +  V  ,  n  =  1,2,  ..., 

n  n  -  1  t  n  '  ' 

y 

where  Vn~  Nq(0,  Bnl)  ,  Jni  and  Bn  _j  are  3n-1 -measurable  matrices,  and  the  {Vn}  are 
mutually  independent  random  variables,  independent  of  W.  It  is  well  known  that  the  conditional  distri¬ 
bution  of  Xf  given  3^  is  Gaussian,  with  conditional  mean  (1°  and  variance  En  given  recursively  by 
the  Kalman  Filter. 

6. 2. a  Proposition 

Suppose  <|) :  — >  Rp  and  Q\Rq  — >  Rq  are  any  C~  diffeomorphisms.  When  the  Gl  Filter  is  applied  to  the 
process  { <| )  (Xt) ,  t  >  0}  ,  with  observations  0  ( Yj) ,  0  ( Y2) ,  . . .  at  times  1 1  <  t2  <  ■ . . ,  the  state  estimator 
of  (])  (Xf  ) ,  given  0  ( Yj ) ,  ....  0  (Y  )  ,  is  (])  ( A°)  /  w/th  conditional  covariance  tensor 

D(h(A°)E°(D(h(A°))7. 

The  theorem  says,  in  effect,  that  when  a  nonlinear  system  is  a  transformed  version  of  a  linear  system, 
then  the  Gl  Filter  estimates  are  similarly  transformed  versions  of  the  Kalman  Filter  estimates,  as  we 
desired  in  Section  1.3.c. 

Proof:  Since  every  step  in  the  Gl  Filter  is  coordinate-independent,  it  suffices  to  prove  the  theorem 
when  A  and  0  are  both  the  identity.  When  {Xt,  t  >  0}  satisfies  (64),  then  (1 )  holds  with  a  (x)  not 
depending  on  x,  and  b  (x)  is  of  the  form  Ax ,  where  A  stands  for  An  1  when  te  [t  r  t  ]  .  The 
connector  T  is  zero  on  N  =  RP ,  so  \  (x)  =  Ax ,  and  (x)  is  zero. 

Likewise  since  the  covariance  tensor  is  constant  on  observation  space,  the  connector  T  is  zero,  and 

2 

since  \| r  (x)  =  Jn  ^x  is  linear,  we  have  D  \\t  =  0  .  We  see  that  I ,  ^  [X§]  =  0  in  (37)  and 

I ,  .  [t(t(Xg)]  =  0  in  (38).  Abbreviating  J  8  ,  to  J,  8  ,  etc.,  (45)  becomes 

P-n  -  1>  Ai  -  1 

G  =  Z/[JZ/  +B]~\ 

and  p  =  0  in  (46).  In  the  constant-metric  case,  expxv  =  x  +  v  .  From  (53)  and  (55), 

An  =*5  +  G(Yn-ys),En  =  (l-GJ)Z  §, 


which  are  the  Kalman  Filter  estimates. 
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6.3  Relationship  of  the  Gl  Filter  with  Standard  Second-order  Filters 

Suppose  that  there  is  a  coordinate  system  on  state  space  in  which  the  local  connector  r  (.)  ,  defined 
in  (6),  is  zero,  and  a  coordinate  system  on  observation  space  in  which  r  (.)  (the  Levi-Civita  connector 

for  M)  is  zero.  This  situation  occurs,  for  example,  if  a  (x)  does  not  depend  on  x,  and  (3  (y)  does  not 

2  2 

depend  on  y.  The  remaining  nonlinearities  come  from  D“^,  which  need  not  be  zero,  and  from  D"\\i . 
The  formula  (45)  for  G  is  comparable  to  standard  formulas,  but  p  in  (46),  and  the  AILPs  in  (37)  and 
(38)  will  be  non-zero.  The  update  formulas  (53)  and  (55)  become 

£n=x8  +  /x0,£0[Xs]  +GZ§  +  p  (ZS®Z8)  -p(GJZs)  , 

^-Yn-ys-/XoZo[¥(X8)]. 

This  quadratic  formula  is  unlike  the  linear  formulas  found  in  the  continuous-discrete  Extended  Kal¬ 
man  Filter  (see  [1  1  ],  Theorem  8.1,  p.  278),  and  the  Truncated  and  Gaussian  Second-order  Filters 
([1 1],  equation  (9.40),  p.  345). 

6.4  Size  of  Filtering  Errors 

We  have  said  nothing  about  whether  the  observation  function  \|/  has  properties  (such  as  the  rank  of  its 
derivative)  sufficient  to  prevent  filtering  errors  from  diverging.  See  Picard  [15]  fora  rigorous  discussion 
of  this  point  for  a  certain  class  of  filters,  under  additional  assumptions. 

7  Software  Implementation:  Statistical  Results 

MATLAB  codes  for  the  Gl  filter  and  for  a  continuous-discrete  Extended  Kalman  Filter  have  been 
developed  for  the  tracking  problem  described  in  Sections  2.4  and  2.7.  Although  the  Gl  Filter  shows 
reduced  bias,  this  example  is  not  well  suited  to  a  statistical  comparison,  since  results  depend  on  many 
parameters,  on  the  control  law  for  the  tracker,  and  on  the  coordinate  systems  chosen  for  the  EKF. 
Moreover  in  a  nine-dimensional  state  space,  computations  are  relatively  slow  for  both  filters. 

Consequently  a  much  simpler  example  was  selected  as  a  context  for  statistical  comparison.  Here  both 
the  state  process  and  the  observations  are  one-dimensional,  and  the  noise  variance  and  observation 
variance  do  not  vary,  which  forces  both  the  local  connectors  T  (.)  and  T(.)  to  be  zero.  The  model 
is: 


dXf  =  ^(Xt)dt+  JadWt,  t>  0, 


(65) 


Yn  =  \|/(Xn) +VpVn,  n  =  1,2, ....,  (66) 

3  2 

where  c,  (x)  =-x  /2  and  \|/(x)  =x/  (p  +  x  )  .  If  one  were  using  such  a  simple  model  in  real  life, 
direct  calculation  of  the  conditional  density  would  be  a  natural  approach  in  practice.  The  Gl  Filter  and 
EKF  were  programmed  merely  for  the  sake  of  statistical  comparison. 
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Note  that  \|/  has  critical  points  at  ±Jp  ,  at  which  any  filter  is  bound  to  perform  badly.  We  chose 
paramefers  p  =  0.1 ,  a  =  0.01 ,  and  P  =  0.001 ,  which  cause  the  state  process,  which  is  a  positive  recur¬ 
rent  diffusion,  to  visit  the  critical  points  fairly  often.  Since  the  model  is  stationary,  statistical  character¬ 
istics  of  a  filter  may  be  observed  by  simply  running  the  filter  over  thousands  of  cycles.  Fortunately  all 
the  formulas  defined  in  Section  3  can  be  computed  analytically,  without  recourse  to  numerical  inte¬ 
gration.  For  example,  xt  =  xQ/  J 1  +  Xgt,  and  the  AILP  of  Xg  is  given  by 


mS  = 


f  a  -4 

*  -1-  V 
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Histograms  for  the  absolute  value  of  the  filter  error  over  1 0,000  filter  cycles  are  shown.  To  preserve 


Filter  Error  Histogram:  Geometrically  Intrinsic  Filter  Filter  Error  Histogram:  Extended  Kalman  Filter 


stability  of  the  Gl  Filter,  we  placed  a  "collar"  over  the  second  order  term  p  (Zg  ©  Z§  -  GJEg)  in  (53) 
so  that  its  magnitude  could  not  exceed  that  of  the  first  order  term.  Note  that  large  filtering  errors 
occur  less  frequently  for  the  GIF  than  for  the  EKF.  This  reflects  the  fact  that,  when  the  Gl  Filter  is 
"thrown  off"  by  the  nonlinearity  of  the  observation  function  \|/,  it  recovers  more  quickly  than  the  EKF 
does.  An  example  is  shown  in  the  following  time  series  of  1 00  filter  cycles,  using  the  same  X  series. 


True  State:  Geometrically  Intrinsic  Filter, True  State:  Extended  Kalman  Filter, 


Conclusion 
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It  should  be  emphasized  that  the  parameter  values  chosen  represent  an  extreme  regime  at  the  very 
edge  of  instability,  and  that  for  a  large  range  of  parameter  values,  for  example  for  p  >  4  and  small  a 
and  (3,  the  Gl  Filter  and  the  EKF  perform  about  the  same.  Moreover  when  the  second  order  term 
p  (ZS  ©  Z §  -  GJZ g)  is  deleted,  the  performance  advantage  of  the  Gl  Filter  disappears. 

8  Conclusion 

This  project  has  demonstrated  that  there  is  a  natural  intrinsic  generalization  of  the  Kalman  Filter  to 
the  fully  nonlinear  context,  and  that  it  is  possible  to  implement  it  computationally.  Unlike  the  Extended 
Kalman  Filter,  the  state  estimate  for  the  Gl  Filter  is  a  quadratic,  not  a  linear,  function  of  the  observa¬ 
tions.  Computational  experiments  show  that,  when  the  observation  function  is  highly  nonlinear,  there 
exist  choices  of  the  noise  parameters  at  which  the  Gl  Filter  significantly  outperforms  the  EKF. 
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